Algorithms for Optimal Replica Placement Under Correlated Failure in Hierarchical Failure Domains
Authors
K. Alex Mills, R. Chandrasekaran, Neeraj Mittal
Abstract
In data centers, data replication is the primary method used to ensure availability of customer data. To avoid correlated failure, cloud storage infrastructure providers model hierarchical failure domains using a tree, and avoid placing a large number of data replicas within the same failure domain (i.e. on the same branch of the tree). Typical best practices ensure that replicas are distributed across failure domains, but relatively little is known concerning optimization algorithms for distributing data replicas. Using a hierarchical model, we answer how to distribute replicas across failure domains optimally. We formulate a novel optimization problem for replica placement in data centers. As part of our problem we formalize and explain a new criterion for optimizing a replica placement. Our overall goal is to choose placements in which correlated failures disable as few replicas as possible. We provide two optimization algorithms for dependency models represented by trees. We first present an O(n + ρ log ρ) time dynamic programming algorithm for placing ρ replicas of a single file on the leaves (representing servers) of a tree with n vertices. We next consider the problem of placing replicas of m blocks of data, where each block may have different replication factors. For this problem, we give an exact algorithm which runs in polynomial time when the skew, the difference in the number of replicas between the largest and smallest blocks of data, is constant.
A preliminary version of this work appeared in the Proceedings of the 9th Annual International Conference on Combinatorial Optimization and Applications (COCOA), 2015 [1].
This work was supported, in part, by the National Science Foundation (NSF) under grants numbered CNS-1115733 and CNS-1619197.
Preprint submitted to Theoretical Computer Science, January 9, 2017. arXiv:1701.01539v1 [cs.DS], 6 Jan 2017.
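The tree model described in the abstract can be illustrated with a short sketch. It is not the paper's dynamic programming algorithm: it only counts, for each failure domain, how many replicas would be disabled if that domain failed, which is the quantity a good placement tries to keep small. The topology, the node names, and the function `replicas_lost_per_domain` are hypothetical and chosen for illustration only.

```python
from collections import defaultdict


def replicas_lost_per_domain(parent, placement):
    """Count the replicas inside each failure domain.

    parent:    dict mapping every node to its parent (the root maps to None).
    placement: dict mapping a leaf (server) to the number of replicas on it.
    Returns a dict mapping each node that contains at least one replica to
    the number of replicas disabled if that node fails.
    """
    lost = defaultdict(int)
    for leaf, count in placement.items():
        node = leaf
        # Every ancestor of the leaf is a failure domain containing it.
        while node is not None:
            lost[node] += count
            node = parent[node]
    return dict(lost)


# Hypothetical topology: one data center, three racks, two servers per rack.
parent = {
    "dc": None,
    "rack1": "dc", "rack2": "dc", "rack3": "dc",
    "s1": "rack1", "s2": "rack1",
    "s3": "rack2", "s4": "rack2",
    "s5": "rack3", "s6": "rack3",
}

# Three replicas, two of them sharing a rack.
clustered = replicas_lost_per_domain(parent, {"s1": 1, "s2": 1, "s3": 1})
# Three replicas, one per rack.
spread = replicas_lost_per_domain(parent, {"s1": 1, "s3": 1, "s5": 1})
```

In the clustered placement a failure of `rack1` disables two replicas, while in the spread placement any single rack failure disables one. A failure of the root disables all three replicas under either placement.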
Similar resources
Survivable Replica Placement in Tree-Based Dependency Models
In complex systems, such as data centers, component failure is ubiquitous. Studies have shown that many complex systems do not effectively mitigate the impact of correlated failures. A model for handling correlated failure based upon hierarchical dependencies between multiple points of failure is presented. The dependency model is generic in the sense that it makes no assumptions concerning the...
Replica Placement Strategies for Wide-Area Storage Systems
Wide-area durable storage systems trigger data recovery to maintain target data availability levels. Data recovery needs to be triggered when storage nodes permanently fail, i.e., data is lost. Transient failures, where nodes return from failure with data, add noise to determining that a node has failed permanently. Replica placement strategies maintain data availability while minimizing cost, ...
Correlated Data Placement in Distributed Systems
In distributed systems, communication cost is one of the major concerns. Much existing research has been conducted on the placement of data to reduce communication cost and improve performance in widely distributed systems. All of this research focuses on independent data objects. However, data are correlated due to accesses from clients, and the correlation has some impact on data placement. In ...
Optimal replica placement in hierarchical Data Grids with locality assurance
In this paper, we address three issues concerning data replica placement in hierarchical Data Grids that can be presented as tree structures. The first is how to ensure load balance among replicas. To achieve this, we propose a placement algorithm that finds the optimal locations for replicas so that their workload is balanced. The second issue is how to minimize the number of replicas. To solve...
Improving Data Availability Using Combined Replication Strategy in Cloud Environment
As data-intensive applications in cloud computing grow day after day, data popularity in this environment becomes critical. Hence, to improve data availability and provide efficient access to popular data, replication algorithms are now widely used in distributed systems. However, most of them replicate only a static number of replicas on some requested chosen sites and it is ob...
Journal:
- CoRR
Volume: abs/1701.01539
Publication date: 2017